Futility Scaling: High-Associativity Cache Partitioning (Extended Version)

نویسندگان

  • Ruisheng Wang
  • Lizhong Chen
  • Ming Hsieh
چکیده

As shared last level caches are widely used in many-core CMPs to boost system performance, partitioning a large shared cache among multiple concurrently running applications becomes increasingly important in order to reduce destructive interference. However, while recent works start to show the promise of using replacement-based partitioning schemes, such existing schemes either suffer from severe associativity degradation when the number of partitions is high, or lack the ability to precisely partition the whole cache which leads to decreased resource efficiency. In this paper, we propose Futility Scaling (FS), a novel replacement-based cache partitioning scheme that can precisely partition the whole cache while still maintain high associativity even with a large number of partitions. The futility of a cache line represents the uselessness of the line to application performance and can be ranked in different ways by various policies, e.g., LRU and LFU. The idea of FS is to control the size of a partition by properly scaling the futility of its cache lines. We study the properties of FS on both associativity and sizing in an analytical framework, and we present a feedback-based implementation of FS that incurs little overhead in practice. Simulation results show that FS improves performance over previously proposed Vantage and PriSM by up to 6.0% and 13.7%, respectively.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Efficient Way-based Cache Partitioning for Low-Associativity Cache

Cache Partitioning is well-known technique to reduce destructive interference among co-running applications in a shared last-level cache (SLLC). Way-based cache partitioning is a popular partitioning scheme due to its simplicity, but it can dramatically reduce associativity of each partition. Also, most SLLC have limited associativity because the higher associativity causes the higher cache acc...

متن کامل

The Vantage Cache-partitioning Technique Enables Configurability and Quality-of-service Guarantees in Large-scale Chip Multiprocessors with Shared Caches. Caches Can Have Hundreds of Partitions with Sizes Specified at Cache Line Granularity, While Maintaining High Associativity and Strict Isolation among Partitions

......Shared caches are pervasive in chip multiprocessors (CMPs). In particular, CMPs almost always feature a large, fully shared last-level cache (LLC) to mitigate the high latency, high energy, and limited bandwidth of main memory. A shared LLC has several advantages over multiple, private LLCs: it increases cache utilization, accelerates intercore communication (which happens through the cac...

متن کامل

Reconfiguring Cache Associativity: Adaptive Cache Design for Wide-Range Reliable Low-Voltage Operation Using 7T/14T SRAM

This paper presents an adaptive cache architecture for wide-range reliable low-voltage operations. The proposed associativityreconfigurable cache consists of pairs of cache ways so that it can exploit the recovery feature of the novel 7T/14T SRAM cell. Each pair has two operating modes that can be selected based upon the required voltage level of current operating conditions: normal mode for hi...

متن کامل

Dynamic Cache Partitioning for Simultaneous Multithreading Systems

This paper proposes a dynamic cache partitioning method for simultaneous multithreading systems. We present a general partitioning scheme that can be applied to setassociative caches at any partition granularity. Furthermore, in our scheme threads can have overlapping partitions, which provides more degrees of freedom when partitioning caches with low associativity. Since memory reference chara...

متن کامل

Energy efficient cache architectures for single, multi and many core processors

With each technology generation we get more transistors per chip. Whilst processor frequencies have increased over the past few decades, memory speeds have not kept pace. Therefore, more and more transistors are devoted to on-chip caches to reduce latency to data and help achieve high performance. On-chip caches consume a significant fraction of the processor energy budget but need to deliver h...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014